Revising Perceptual Linear Prediction (PLP)

Authors

  • Florian Hönig
  • Georg Stemmer
  • Christian Hacker
  • Fabio Brugnara
Abstract

Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) are the most popular acoustic features used in speech recognition. Which of the two methods performs better often depends on the task. In this work we develop acoustic features that combine the advantages of MFCC and PLP. Based on the observation that the two techniques have many similarities, we revise the processing steps of PLP. In particular, the filter bank, the equal-loudness pre-emphasis and the input for the linear prediction are improved. It is shown for a broadcast news transcription task and a corpus of children's speech that the new variant of PLP performs better than both MFCC and conventional PLP for a wide range of clean and noisy acoustic conditions.
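
For orientation, the processing steps named in the abstract (filter bank, equal-loudness pre-emphasis, linear prediction) can be sketched for conventional PLP as follows. This is a minimal NumPy sketch that assumes the commonly cited Hermansky (1990) formulation; it is not the revised variant proposed in the paper, whose details the abstract does not give. All function names, the frame length, the FFT size and the model order are illustrative assumptions.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Hz -> Bark, using the arcsinh warping from conventional PLP."""
    return 6.0 * np.arcsinh(np.asarray(f_hz, dtype=float) / 600.0)

def bark_to_hz(z):
    """Inverse of hz_to_bark."""
    return 600.0 * np.sinh(np.asarray(z, dtype=float) / 6.0)

def critical_band_weights(d):
    """Piecewise critical-band masking curve; d = bin Bark minus band centre."""
    w = np.zeros_like(d)
    lo = (d >= -1.3) & (d <= -0.5)
    mid = (d > -0.5) & (d < 0.5)
    hi = (d >= 0.5) & (d <= 2.5)
    w[lo] = 10.0 ** (2.5 * (d[lo] + 0.5))
    w[mid] = 1.0
    w[hi] = 10.0 ** (-1.0 * (d[hi] - 0.5))
    return w

def equal_loudness(f_hz):
    """Approximate equal-loudness weighting as a function of frequency."""
    w2 = (2.0 * np.pi * np.asarray(f_hz, dtype=float)) ** 2
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> LP coefficients, error."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def plp_lpc_sketch(frame, fs, order=12, nfft=512):
    """Conventional PLP up to the all-pole model (hypothetical helper)."""
    # Short-time power spectrum of the Hamming-windowed frame.
    power = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    # Critical-band filter bank with roughly 1-Bark spacing up to Nyquist.
    z_nyq = float(hz_to_bark(fs / 2.0))
    nbands = int(np.ceil(z_nyq)) + 1
    centers = np.linspace(0.0, z_nyq, nbands)
    fbank = critical_band_weights(hz_to_bark(freqs)[None, :] - centers[:, None])
    bands = fbank @ power
    # Equal-loudness pre-emphasis, then cube-root intensity-to-loudness compression.
    bands = np.cbrt(bands * equal_loudness(bark_to_hz(centers)))
    # The edge bands are not well defined; copy their neighbours.
    bands[0], bands[-1] = bands[1], bands[-2]
    # Inverse DFT of the auditory spectrum gives autocorrelation-like values,
    # which are the input to the linear prediction.
    r = np.fft.irfft(bands)[: order + 1]
    return levinson(r, order)
```

Calling `plp_lpc_sketch(frame, 16000)` on a 25 ms frame returns the prediction polynomial and the residual energy; cepstral coefficients would normally be derived from these in a further step that is omitted here.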

Similar resources

Speaker Identification Using Gaussian Mixture Models

In this paper, the performance of Perceptual Linear Prediction (PLP) features is compared with that of Linear Prediction Coefficient (LPC) features for speaker identification. Two classification techniques, Gaussian Mixture Models (GMM) and Vector Quantization (VQ) with Dynamic Time Warping (DTW), are used for classification of speakers based on their speech samples into respec...

MRASTA and PLP in automatic speech recognition

This work explores different methods for combining estimated posterior probabilities from Multi-RASTA (MRASTA) and Perceptual Linear Prediction (PLP) features for Automatic Speech Recognition (ASR). The improved performance of the ASR system indicates the complementary nature of the information present in MRASTA and PLP. Among the combination methods explored, the product gives the best performance.

PLP²: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns

The temporal trajectories of the spectral energy in auditory critical bands over 250 ms segments are approximated by an all-pole model, the time-domain dual of conventional linear prediction. This quarter-second auditory spectro-temporal pattern is further smoothed by iterative alternation of spectral and temporal all-pole modeling. Just as Perceptual Linear Prediction (PLP) uses an autoregress...

Perceptual Analysis of Speech Signals from People with Parkinson's Disease

Parkinson’s disease (PD) is a neurodegenerative disorder of the central nervous system that affects patients' limb motor control and communication skills. As the disease progresses, it can reach the point of affecting the intelligibility of the patient’s speech. Treatments for PD are mainly focused on improving limb symptoms and their impact on speech production is still...

Voiceprint analysis using Perceptual Linear Prediction and Support Vector Machines for detecting persons with Parkinson’s disease

With the aim of developing the assessment of speech disorders for detecting patients with Parkinson’s disease (PD), we collected 34 recordings of the sustained vowel /a/ from 34 subjects, including 17 PD patients. We subsequently extracted from 1 to 20 Perceptual Linear Prediction (PLP) coefficients from each individual. To extract the voiceprint from each individual, we compressed the frames by calc...

Publication date: 2005